PE-Assembler: de novo assembler using short paired-end reads
Abstract
MOTIVATION: Many de novo genome assemblers have been proposed recently. Most existing methods are based on the de Bruijn graph, a complex graph structure that attempts to encompass the entire genome. Such graphs can be prohibitively large, may fail to capture subtle information and are difficult to parallelize. RESULT: We present a method that eschews the traditional graph-based approach in favor of a simple 3' extension approach that has the potential to be massively parallelized. Our results show that it obtains assemblies that are more contiguous, more complete and less error prone than those of existing methods. AVAILABILITY: The software package can be found at http://www.comp.nus.edu.sg/~bioinfo/peasm/. Alternatively, it is available from the authors upon request.
Similar resources
A Comprehensive Study of De Novo Genome Assemblers: Current Challenges and Future Prospective
Background: Current advancements in next-generation sequencing technology have made it possible to sequence whole genomes, but assembling a large number of short sequence reads is still a big challenge. In this article, we present a comparative study of seven assemblers, namely, ABySS, Velvet, Edena, SGA, Ray, SSAKE, and Perga, using prokaryotic and eukaryotic paired-end as well as single-end data ...
PERGA: A Paired-End Read Guided De Novo Assembler for Extending Contigs Using SVM and Look Ahead Approach
Since the read lengths of high throughput sequencing (HTS) technologies are short, de novo assembly, which plays significant roles in many applications, remains a great challenge. Most of the state-of-the-art approaches are based on the de Bruijn graph strategy and the overlap-layout strategy. However, these approaches, which depend on k-mers or read overlaps, do not fully utilize information of paired-end and ...
Crystallizing short-read assemblies around lone Sanger reads
New short-read sequencing technologies produce large volumes of 25-30 base paired-end reads. In this paper, we present a sequencing protocol and de novo assembler program (SHORTY) targeted towards such microread data. Our protocol augments short-paired reads using a trivially small number of Sanger reads (only one to three reads per bacterial genome). Still, these “seed reads” enable us to prod...
An extended genovo metagenomic assembler by incorporating paired-end information
Metagenomes present assembly challenges, when assembling multiple genomes from mixed reads of multiple species. An assembler for single genomes can't adapt well when applied in this case. A metagenomic assembler, Genovo, is a de novo assembler for metagenomes under a generative probabilistic model. Genovo assembles all reads without discarding any reads in a preprocessing step, and is therefore...
Pebble and Rock Band: Heuristic Resolution of Repeats and Scaffolding in the Velvet Short-Read de Novo Assembler
BACKGROUND Despite the short length of their reads, micro-read sequencing technologies have shown their usefulness for de novo sequencing. However, especially in eukaryotic genomes, complex repeat patterns are an obstacle to large assemblies. PRINCIPAL FINDINGS We present a novel heuristic algorithm, Pebble, which uses paired-end read information to resolve repeats and scaffold contigs to pro...
Journal title:
- Bioinformatics
Volume 27, Issue 2
Pages: -
Publication date: 2011